
    The sampling brain

    Understanding the algorithmic nature of mental processes is of vital importance to psychology, neuroscience, and artificial intelligence. In response to a rapidly changing world and computationally demanding cognitive tasks, evolution may have endowed us with brains that approximate rational solutions, such that our performance is close to optimal. This thesis proposes that one such approximation algorithm, sample-based approximation, is implemented by the brain to tackle complex cognitive tasks. Knowing that certain types of sampling are used to generate mental samples, the brain could also actively correct for the uncertainty that comes along with the sampling process. This correction process leaves traces in human probability estimates, suggesting a more rational account of sample-based estimation. In addition, these mental samples can come from both observed experiences (memory) and synthesised experiences (imagination). Each source of mental samples plays a unique role in learning tasks, and the classical error-correction principle of learning can be generalised when mental-sampling processes are considered.
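    The sample-then-correct idea can be sketched in a few lines. In this toy version (the Bernoulli event, the symmetric Beta prior used for the correction, and all names are illustrative assumptions, not the thesis's actual model), a raw relative frequency from a handful of samples is shrunk toward 0.5 to compensate for sampling noise:

```python
import random

def sample_estimate(event_prob, n_samples, beta=1.0, seed=0):
    """Estimate an event's probability from a handful of mental samples,
    then correct the raw frequency for sampling uncertainty by shrinking
    it toward 0.5 with a symmetric Beta(beta, beta) prior."""
    rng = random.Random(seed)
    k = sum(rng.random() < event_prob for _ in range(n_samples))
    raw = k / n_samples                              # naive relative frequency
    corrected = (k + beta) / (n_samples + 2 * beta)  # Beta posterior mean
    return raw, corrected
```

    With few samples the corrected estimate is conservative (pulled toward 0.5), which is one way a correction for sampling noise could leave systematic traces in reported probabilities.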

    Probabilistic biases meet the Bayesian brain

    Bayesian cognitive science sees the mind as a spectacular probabilistic inference machine. But Judgment and Decision Making (JDM) research has spent half a century uncovering how dramatically and systematically people depart from rational norms. This paper outlines recent research that opens up the possibility of an unexpected reconciliation. The key hypothesis is that the brain neither represents nor calculates with probabilities, but approximates probabilistic calculations by drawing samples from memory or mental simulation. Sampling models diverge from perfect probabilistic calculations in ways that capture many classic JDM findings, and offer the hope of an integrated explanation of classic heuristics and biases, including availability, representativeness, and anchoring and adjustment.
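    One way sampling can reproduce anchoring and adjustment is via a Markov chain that starts at the anchor and is stopped early: its average stays biased toward the starting point. A toy sketch (the normal target, step size, and function name are illustrative assumptions, not a model from the paper):

```python
import math
import random

def short_chain_mean(target_mean, anchor, n_steps, step=0.5, seed=0):
    """Average of a Metropolis-Hastings chain over a unit-variance normal
    target. Stopping after only a few steps leaves the estimate biased
    toward the anchor, mimicking anchoring-and-adjustment."""
    rng = random.Random(seed)

    def log_p(x):
        return -0.5 * (x - target_mean) ** 2

    x = anchor
    total = 0.0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        # Accept with the usual Metropolis probability min(1, p(prop)/p(x)).
        if rng.random() < math.exp(min(0.0, log_p(proposal) - log_p(x))):
            x = proposal
        total += x
    return total / n_steps
```

    Run long enough, the same chain converges to the target mean; truncating it is what produces the anchor bias.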

    Information seeking as chasing anticipated prediction errors

    When faced with delayed, uncertain rewards, humans and other animals usually prefer to know the eventual outcomes in advance. This preference for cues providing advance information can lead to seemingly suboptimal choices, where less reward is preferred over more reward. Here, we introduce a reinforcement-learning model of this behavior, the anticipated prediction error (APE) model, based on the idea that prediction errors themselves can be rewarding. As a result, animals will sometimes pick options that yield large prediction errors, even when the expected rewards are smaller. We compare the APE model against an alternative information-bonus model, where information itself is viewed as rewarding. These models are evaluated against a newly collected dataset with human participants. The APE model fits the data as well as or better than the other models, with fewer free parameters, thus providing a more robust and parsimonious account of the suboptimal choices. These results suggest that anticipated prediction errors can be an important signal underpinning decision making.
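    The core idea, that a prediction error can itself carry value, can be captured in a reduced form. In this sketch the functional form, the weight `w_ape`, and the assumption that only the informative option earns the anticipation bonus are illustrative, not the paper's fitted model:

```python
def ape_value(p_win, reward, w_ape=0.5):
    """Reduced-form anticipated-prediction-error valuation: expected reward
    plus w_ape times the expected absolute prediction error produced when
    an informative cue reveals the outcome (hypothetical form)."""
    ev = p_win * reward
    # On a win the revealed PE is reward - ev; on a loss it is 0 - ev.
    expected_pe = p_win * abs(reward - ev) + (1.0 - p_win) * ev
    return ev + w_ape * expected_pe

# A smaller but informative reward can beat a larger uninformative one:
informative = ape_value(0.5, 8.0)   # 4.0 expected reward + 0.5 * 4.0 bonus
uninformative = 0.5 * 10.0          # plain expected value, no APE bonus
```

    Under these assumptions the informative option is preferred even though its expected reward is lower, reproducing the signature suboptimal choice.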

    A Visualization method for machine translation evaluation results

    PACLIC 20 / Wuhan, China / 1-3 November, 2006

    Dynamics of quantum entanglement in the reservoir with memory effects

    The non-Markovian dynamics of quantum entanglement is studied via the Shabani-Lidar master equation when one of the entangled quantum systems is coupled to a local reservoir with memory effects. The completely positive reduced dynamical map can be constructed in the Kraus representation. Quantum entanglement decays more slowly in the non-Markovian environment, and the decoherence time for quantum entanglement can be markedly increased by changing the memory kernel. It is found that entanglement sudden death between the quantum systems and entanglement sudden birth between the system and reservoir occur at different instants.
    Comment: 14 pages, 3 figures
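    For reference, the Shabani-Lidar post-Markovian master equation takes the form below, where $\mathcal{L}$ is the Markovian Liouvillian and $k(t')$ is the memory kernel whose shape sets the decoherence time:

```latex
\frac{\partial \rho}{\partial t}
  = \mathcal{L} \int_0^t dt' \, k(t') \, e^{\mathcal{L} t'} \, \rho(t - t')
```

    In the limit $k(t') \to \delta(t')$ the memory vanishes and the usual Markovian equation $\dot{\rho} = \mathcal{L}\rho$ is recovered.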

    U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning

    Zero-shot speaker cloning aims to synthesize speech for any target speaker unseen during TTS system building, given only a single speech reference of the speaker at hand. Although more practical in real applications, current zero-shot methods still produce speech with unsatisfactory naturalness and speaker similarity. Moreover, endowing the target speaker with arbitrary speaking styles in the zero-shot setup has not been considered. This is because the unique challenge of zero-shot speaker and style cloning is to learn disentangled speaker and style representations from only short references representing an arbitrary speaker and an arbitrary style. To address this challenge, we propose U-Style, which employs Grad-TTS as the backbone, cascading a speaker-specific encoder and a style-specific encoder between the text encoder and the diffusion decoder. Thus, leveraging signal perturbation, U-Style is explicitly decomposed into speaker- and style-specific modeling parts, achieving better speaker and style disentanglement. To improve unseen speaker and style modeling ability, these two encoders conduct multi-level speaker and style modeling via skip-connected U-nets, incorporating the representation extraction and information reconstruction processes. In addition, to improve the naturalness of synthetic speech, we adopt mean-based instance normalization and style adaptive layer normalization in these encoders to perform representation extraction and condition adaptation, respectively. Experiments show that U-Style significantly surpasses state-of-the-art methods in unseen speaker cloning regarding naturalness and speaker similarity. Notably, U-Style can transfer the style from an unseen source speaker to another unseen target speaker, achieving flexible combinations of desired speaker timbre and style in zero-shot voice cloning.

    DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin

    While the performance of cross-lingual TTS based on monolingual corpora has been significantly improved recently, generating cross-lingual speech still suffers from the foreign accent problem, leading to limited naturalness. Besides, current cross-lingual methods ignore modeling emotion, which is indispensable paralinguistic information in speech delivery. In this paper, we propose DiCLET-TTS, a Diffusion model based Cross-Lingual Emotion Transfer method that can transfer emotion from a source speaker to intra- and cross-lingual target speakers. Specifically, to relieve the foreign accent problem while improving emotion expressiveness, the terminal distribution of the forward diffusion process is parameterized into a speaker-irrelevant but emotion-related linguistic prior by a prior text encoder with the emotion embedding as a condition. To address the weaker emotional expressiveness problem caused by speaker disentanglement in emotion embedding, a novel orthogonal projection based emotion disentangling module (OP-EDM) is proposed to learn a speaker-irrelevant but emotion-discriminative embedding. Moreover, a condition-enhanced DPM decoder is introduced to strengthen the modeling ability of the speaker and the emotion in the reverse diffusion process, further improving emotion expressiveness in speech delivery. Cross-lingual emotion transfer experiments show the superiority of DiCLET-TTS over various competitive models and the good design of OP-EDM in learning speaker-irrelevant but emotion-discriminative embeddings.
    Comment: accepted by TASL
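    The orthogonal-projection idea behind removing speaker information from an emotion embedding can be illustrated in a few lines. This is a hypothetical reading, not the paper's OP-EDM module: subtract the component of the emotion embedding that lies along the speaker embedding, leaving a speaker-orthogonal residual:

```python
def remove_speaker_component(emotion_vec, speaker_vec):
    """Project emotion_vec onto the orthogonal complement of speaker_vec:
    subtract the speaker-aligned component (dot / ||s||^2) * s.
    Illustrative sketch of an orthogonal-projection disentangling step."""
    dot = sum(e * s for e, s in zip(emotion_vec, speaker_vec))
    norm_sq = sum(s * s for s in speaker_vec)
    return [e - (dot / norm_sq) * s for e, s in zip(emotion_vec, speaker_vec)]
```

    The residual is orthogonal to the speaker embedding by construction, so whatever variance it retains cannot be explained by the speaker direction.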